We study the evolution of cooperation in the spatial prisoner's dilemma game with five competing strategies: 
unconditional cooperation, unconditional defection, tit-for-tat, win-stay-lose-shift, and extortion. 
While pairwise imitation fails to sustain unconditional cooperation and extortion regardless of game 
parametrization, myopic updating gives rise to the coexistence of all five strategies if the temptation to defect 
is sufficiently large or if the degree distribution of the interaction network is heterogeneous. This 
counterintuitive evolutionary outcome emerges as a result of an unexpected chain of strategy invasions. 
Firstly, defectors emerge and coarsen spontaneously among players adopting win-stay-lose-shift. Secondly, 
extortioners and players adopting tit-for-tat emerge and spread via neutral drift among the emerged 
defectors. And lastly, among the extortioners, cooperators become viable too. These recurrent evolutionary 
invasions yield a five-strategy phase that is stable irrespective of the system size and the structure of the 
interaction network, and they reveal the most unexpected mechanism that stabilizes extortion and 
cooperation in an evolutionary setting. 

Widespread cooperation in nature is one of the most important challenges to Darwin's theory of 
evolution and natural selection, but it is also the main driving force behind the evolutionary transitions 
that led from single-cell organisms to complex animal and human societies 1. And it appears to be this 
mixture of a fascinating riddle and utmost importance that makes cooperation so irresistibly attractive to study. 
Evolutionary game theory 2-6 is thereby the most frequently employed theoretical framework, revealing mechanisms 
such as kin selection 7, network reciprocity 8, direct and indirect reciprocity 9,10, as well as group selection 11 as 
potent promoters of cooperative behavior. Adding to these established five rules for the evolution of cooperation 12, 
recent years have witnessed a surge of predominantly interdisciplinary studies, linking together knowledge 
from biology, sociology, economics as well as mathematics and physics, to identify new ways by means of which 
the successful evolution of cooperation amongst selfish and unrelated individuals can be understood 13-20. 

From the large array of games that make up evolutionary game theory, none has received as much attention as 
the prisoner's dilemma game 21-35. Each instance of the game is contested by two players who have to decide 
simultaneously whether they want to cooperate or defect. The dilemma is given by the fact that although mutual 
cooperation yields the highest collective payoff, a defector will do better if the opponent decides to cooperate. The 
rational outcome is thus mutual defection. The popularity of the game was helped significantly by the tourna- 
ments that were organized by Robert Axelrod 36 , where the most successful strategy for the iterated prisoner's 
dilemma game was sought. Interestingly the long-term winner was the tit-for-tat strategy by the simple and 
intuitive virtue of always following the opponent's previous action. However, tit-for-tat cannot correct erroneous 
moves, and it is also vulnerable to random drift when mutant strategies appear which always cooperate 37 . Nowak 
and Sigmund therefore proposed win-stay-lose-shift as another equally simple strategy that has neither of these 
two disadvantages, and can outperform tit-for-tat in the prisoner's dilemma game 22 . Players adopting win-stay- 
lose-shift simply repeat the previous move if the resulting payoff has met their aspiration level and change 
otherwise. 

The simplicity and effectiveness of strategies like tit-for-tat and win-stay-lose-shift were unrivaled for decades, 
and they generated a large following of the seminal works that introduced them. Recently, however, Press and 
Dyson have introduced a new class of so-called zero-determinant strategies that can dominate any opponent in 
the iterated prisoner's dilemma game 38. A particularly interesting 
subset of the class are extortion strategies, which ensure that an 
increase in one's own payoff exceeds the increase in the other player's 
payoff by a fixed percentage. Extortion is therefore able to dominate 
any opponent 39. But this holds only if players are unable to change 
strategies in response to their failures. In an evolutionary setting, 
where players are able to imitate strategies that are more successful, 
extortion was shown to be evolutionarily unstable 40. If the two players 
engaged in the game belong to distinct populations, or if the population 
size is very small, on the other hand, extortioners can nevertheless 
prevail, and rather counterintuitively, they may also act as 
catalysts for the evolution of cooperation 41. Evolutionary stability can 
also be warranted by generous zero-determinant strategies through 
their mutually supporting behavior 42. 

Results summarized thus far concerning zero-determinant strategies 
were obtained in well-mixed populations. Yet it is well known 
that stable solutions in structured populations can differ 
significantly from those in well-mixed populations. The most 
prominent example of this fact is the successful evolution of cooperation 
in the spatial prisoner's dilemma game through network 
reciprocity 8. Further examples include the stabilization of reward 43, 
peer and pool punishment 44,45, in-group favoritism 46, as well as 
homophily 47, to name but a few. Indeed, the fact that the interactions 
among players are frequently not random, and hence not best described 
by a well-mixed model, but are rather limited to a set of 
other players in the population and as such are best described by a 
network, has far-reaching consequences for the outcome of evolutionary 
processes 13,15,16,18,19. 

Motivated by this, we have recently shown that in structured 
populations the microscopic dynamic that governs strategy updating 
plays a decisive role for the fate of extortioners 48. By using the simplest 
three-strategy model, comprising cooperators (C), defectors 
(D), and extortioners (E_χ), we have shown that pairwise imitation 
and birth-death dynamics return the same evolutionary outcomes as 
reported previously in well-mixed populations. The usage of myopic 
best response strategy updating, on the other hand, renders extortion 
evolutionarily stable via neutral drift. Counterintuitively, the stability 
of extortioners helps cooperators to survive even under the most 
testing conditions, whereby the neutral drift of E_χ players serves as 
the entry point, akin to a Trojan horse, for cooperation to gain a foothold 
among defectors. Although the mutually rewarding checkerboard-like 
coexistence of cooperators and extortioners can always be temporarily 
disturbed by defectors, it is only a matter of time before 
neutral drift reintroduces extortioners and the whole cycle starts 
anew. 

Here we extend our study to five competing strategies, taking 
into account also the tit-for-tat strategy (TFT) and the win-stay- 
lose-shift strategy (WSLS), in addition to the previous three that 
we have studied in 48. The five strategies D, C, E_χ, TFT, and WSLS 
are the same as studied recently by Hilbe et al. 41 in well-mixed 
populations, with the strength of the social dilemma b and the 
strength of exploitation χ being the two main parameters that 
determine the payoffs amongst the strategies. For details about 
the parametrization of the game and the applied updating rules, 
we refer to the Methods section. The inclusion of the tit-for-tat 
strategy and the win-stay-lose-shift strategy promises fascinating 
evolutionary outcomes, especially since under well-mixed condi- 
tions D can beat WSLS, but the dominance reverses in the pre- 
sence of the other three strategies. As we will show in the next 
Section, in structured populations WSLS dominate completely for 
sufficiently small values of b if the interaction network is charac- 
terized by a homogeneous degree distribution. Beyond a threshold 
value of b, or if the interaction network is characterized by a 
heterogeneous degree distribution (see for example 49 ), however, 
D emerge and coarsen spontaneously, which in turn opens up 
the possibility for all the other strategies to emerge as well. 

Figure 1 | Imitation on a square lattice fails to sustain cooperation and 
extortion. Depicted are the stationary frequencies of surviving strategies in 
dependence on the strength of the social dilemma b. It can be observed that 
for sufficiently small values of b only WSLS survive. As b increases, the pure 
WSLS phase first gives way to a narrow two-strategy WSLS + D phase, 
which then transforms into the three-strategy WSLS + TFT + D phase. The 
emergence of these three different phases is a direct consequence of 
dominance relations between the three involved strategies, which are 
schematically depicted in the bottom frame for the respective values of b 
from left to right. Arrows show the direction of invasion between strategies. 

Results 

Before turning to the main results obtained with myopic best res- 
ponse updating, we present in Fig. 1 the evolutionary outcomes 
obtained via imitation on a square lattice. If imitation is the basis 
of strategy updating, then neither cooperators nor extortioners can 
survive, regardless of the strength of the social dilemma and 
the strength of exploitation. Since extortioners always die out, the 
composition of the final state is actually completely independent of χ. 
We have used χ = 1.5 for the presented results, but the value influences 
only the time needed for relaxation towards the final stable 
solution. Starting with b > 1 (we show results from b = 1.5 onwards 
for clarity with regards to the subsequent phase transitions), the 
completely dominant strategy is WSLS. At the other end of the interval 
of b, we have a stable three-strategy WSLS + TFT + D phase, 
which is sustained by cyclic dominance. In between, we have a narrow 
two-strategy WSLS + D phase, which terminates immediately 
after D reach dominance. 

This dependence on b can be understood by considering the rela- 
tions among the surviving strategies, as summarized in the bottom 
frame of Fig. 1. For small values of b (left), WSLS dominate both D 
and TFT. The latter also dominate D, but their superior status in this 
relationship has no effect on the final state. For high values of b 
(right), the direction of invasion between WSLS and D changes 
compared to the low b case, while the other two relations remain 
unchanged. Consequently, instead of a pure WSLS phase, we have a 
three-strategy WSLS + TFT + D phase, where WSLS invade TFT, 
TFT invade D, and D invade WSLS to close the loop of dominance. It 
is worth emphasizing that this solution is impossible in a well-mixed 
population for all b < 2. 

In a narrow interval between the pure WSLS phase and the cyclic 
WSLS + TFT + D phase, we have the situation depicted in the 
middle of the bottom frame of Fig. 1, where unlike for small and 
high values of b, the relation between WSLS and D enables their 
coexistence in a structured population. As for small values of b, here 
too TFT can invade D, but this is without effect on the final outcome. 

Figure 2 | The coexistence of defectors and players adopting the win-stay-lose-shift 
strategy in case of imitation on a square lattice. Depicted is the 
time evolution of the frequency of defectors f_D as obtained for b = 1.7, 
1.734, 1.736, 1.738, 1.739 and 1.741 from bottom to top. The time courses 
provide insight into the competition for space within the narrow two-strategy 
WSLS + D phase that can be observed in Fig. 1. At b = 1.741 
defectors come to dominate the whole population, but their dominance is 
immediately overthrown in favor of the three-strategy WSLS + TFT + D 
phase that is sustained by cyclic dominance. The used linear size of the 
square lattice is L = 1000. Note that the time scale is logarithmic. 

The stable two-strategy coexistence is illustrated in Fig. 2, where we 
show how WSLS and D compete for space over time for different 
values of b. The larger the value of b, the smaller the fraction of the 
population that is occupied by WSLS in the stationary state. 
Interestingly, when b is large enough for D to fully eliminate 
WSLS, the complete dominance of defectors is prevented by the 
presence of TFT, who become viable via a second-order continuous 
phase transition. From this point onwards, the cyclic dominance 
WSLS -> TFT -> D -> WSLS starts working until the end of the 
interval of b, as depicted in the main panel of Fig. 1. 

Overall, extortion is unable to capitalize on structured interactions 
if the strategy updating is governed by imitation or a birth-death rule 
(results not shown), and in fact this is in full qualitative agreement 
with the results obtained in well-mixed populations 40,41. In the realm 
of evolutionary games, extortioners do not do well against cooperative 
strategies like C, TFT and WSLS. They may thrive for a short 
period of time, but as soon as extortion becomes widespread, it is more 
profitable to cooperate, which ultimately renders extortion evolutionarily 
unstable. 

Myopic strategy updating, on the other hand, can sustain very 
different evolutionary outcomes as it allows players to adopt strat- 
egies that are not necessarily present in their interaction neighbor- 
hood. In fact, strategies need not be present in the population at all, as 
long as they are an option for the players to choose randomly when it 
is their turn to perhaps change their strategy. Nevertheless, we 
emphasize that myopic best response updating is different from 
mutation, because each individual strategy change is still driven by 
the payoff difference, as described by Eq. 1. Results presented in Fig. 3 
obtained on the square lattice (top) and the random regular graph 
(middle) show that for sufficiently small values of b the final state is 
the same as under imitation dynamics. Players adopting WSLS dom- 
inate completely from b = 1 onwards (as in Fig. 1, we show results for 
b > 1.5 only). At a critical value of b, however, a second-order 
continuous phase transition rather unexpectedly leads to the stable 
coexistence of all five competing strategies. A similar diversity of 
strategies prevails on heterogeneous interaction networks, as illu- 
strated by the results obtained on a scale-free network shown in 
the bottom panel of Fig. 3. Myopic best response updating is thus 
able to stabilize extortion in structured populations. Perhaps even 
more surprisingly, as the strength of the social dilemma increases, the 
two cooperative strategies C and TFT become viable as well. This outcome 
is rather independent of the structure of the interaction network. 
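
The way in which the payoff difference drives each strategy change under myopic updating can be illustrated with a minimal Python sketch. It assumes the standard Fermi form of the adoption probability in Eq. 1, with K = 0.05 as given in the Methods. The function name `adoption_probability` and the overflow guard are illustrative choices of this sketch and are not taken from the original simulations.

```python
import math


def adoption_probability(p_current, p_candidate, K=0.05):
    """Probability that a player switches to a candidate strategy.

    Assumed Fermi form of Eq. 1: 1 / (1 + exp[(p_current - p_candidate) / K]),
    where p_candidate is the payoff the same player would collect with the
    candidate strategy in the same neighborhood.
    """
    x = (p_current - p_candidate) / K
    # Guard against floating-point overflow when the candidate is far worse.
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


# A neutral alternative (equal payoffs) is adopted with probability 1/2.
print(adoption_probability(0.0, 0.0))
# A clearly worse alternative is practically never adopted.
print(adoption_probability(1.0, 0.0))
# A clearly better alternative is practically always adopted.
print(adoption_probability(0.0, 1.0))
```

Under this form, a candidate strategy that yields the same payoff as the current one is accepted half of the time, which is the situation of E_χ and TFT among defectors, whereas a strategy with a lower payoff is rejected almost surely. This is why such updating differs from random mutation.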

Since extortioners survive for sufficiently high values of b, the 
strength of extortion χ might play a role too, but as evidenced by 
the results presented in Fig. 4, this role is in fact very minor. As the 
value of χ increases, the extortioners become slightly more common 
at the expense of TFT and C players, but overall this does not affect 
the evolutionary stability of extortion and cooperation. Compared to 
our previous results presented in 48, where we have studied the three-strategy 
variant of the game without TFT and WSLS players, the role 
of χ is less significant here mainly because the stationary frequency of 
extortioners is much smaller. The fact that their frequency is much 
smaller, however, is a direct consequence of the presence of the two 
additional cooperative strategies (TFT and WSLS), which in turn 
highlights the general subordinate role of extortioners compared to 
cooperation in evolutionary games. The latter was emphasized 
already in 40,41, as well as by the results presented in Fig. 1 above. 
Also contributing to the minor role of χ is that the emergence of 
extortioners is in fact a second-order effect, as we will explain next. 

To understand why E_χ, TFT and C emerge as b increases, it is 
instructive to consider the erosion of the pure WSLS phase on the square 
lattice, as illustrated in Fig. 5. For a sufficiently high value of b defectors 
emerge and start to coarsen spontaneously because their payoff 
becomes competitive with the payoff of aggregated WSLS players. 
The emergence of the D phase, however, paves the way for the emergence 
of all the other strategies. Namely, both E_χ and TFT are neutral 
against D, and thus they may emerge by chance and spread via 
neutral drift. As E_χ accumulate locally, C become viable too because 
their payoff is higher. The emergence of C is helped further (or at 
least not hindered) by TFT, who are neutral with C. During this 
unexpected chain of strategy invasions, defection and extortion thus 
emerge as catalysts of unconditional cooperation. Effectively, the 
defectors act as a Trojan horse for all the other strategies, while 
subsequently the extortioners act as a Trojan horse for cooperation. 
Evidently, the spreading of C, which utilizes the neutral drift of E_χ, 
will be controlled by defectors and WSLS players who can strike back 
since their presence in place of an extortioner may yield a higher 
payoff in a predominantly cooperative neighborhood. This, however, 
will again be only temporary, since the described elementary invasions 
are bound to recur, thus assuring the stability of the five-strategy 
WSLS + D + E_χ + TFT + C phase. 

An important lesson from the results presented in Fig. 5 is 
that although extortion can be as counterproductive as defection, it is 
still less destructive. For an unconditional cooperator it never pays 
to stick with the strategy if surrounded by defectors, but it may be the 
best option among extortioners. Cooperators are of course happiest 
among other cooperators, but in the presence of extortioners they 
can still attain a positive payoff, and this is much better than nothing 
or a negative value in the presence of defectors. It is worth emphas- 
izing that this argument is valid independently of the properties of 
the interaction network, as the described chain of strategy invasions 
emerges in all the structured populations that we have considered. 
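
The payoff argument for a cooperator surrounded by defectors can be made concrete with a short Python sketch. It is restricted to the C and D strategies, whose mutual payoffs constitute the donation game with b − c = 1 (see Methods). The 2 × 2 restriction, the integer strategy labels, and the helper names `donation_payoffs` and `site_payoff` are hypothetical choices made only for illustration; the full five-strategy payoff matrix follows Hilbe et al. 41 and is not reproduced here.

```python
import numpy as np

C, D = 0, 1  # strategy labels used only in this sketch


def donation_payoffs(b):
    """Donation-game payoffs with b - c = 1; rows: focal player, columns: opponent."""
    c = b - 1.0
    return np.array([[b - c, -c],    # C against C, C against D
                     [b, 0.0]])      # D against C, D against D


def site_payoff(grid, i, j, strategy, payoff):
    """Payoff that `strategy` would collect at site (i, j) against its four
    nearest neighbors on an L x L lattice with periodic boundary conditions."""
    L = grid.shape[0]
    total = 0.0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        neighbor = grid[(i + di) % L, (j + dj) % L]
        total += payoff[strategy, neighbor]
    return total


b = 1.8
payoff = donation_payoffs(b)

# A single cooperator in a sea of defectors.
grid = np.full((5, 5), D)
grid[2, 2] = C

p_x = site_payoff(grid, 2, 2, C, payoff)    # payoff with the current strategy
p_alt = site_payoff(grid, 2, 2, D, payoff)  # payoff if the player switched to D
print(p_x, p_alt)
```

For b = 1.8 the cost is c = 0.8, so the lone cooperator collects −4c from its four defecting neighbors, whereas switching to D would yield zero. With a payoff difference of this size relative to K = 0.05, the switch is adopted almost surely under Eq. 1.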

Discussion 

We have shown that even if the set of competing strategies is 
extended to encompass, besides unconditional cooperators, defectors 
and extortioners 48, also the tit-for-tat strategy and the win-stay-lose-shift 
strategy, the imitation dynamics in structured populations 
is still unable to render extortion evolutionarily stable. For sufficiently 
small values of b only players adopting the win-stay-lose-shift strategy 
survive, while beyond a threshold value a stable three-strategy 
phase consisting of defectors, tit-for-tat and win-stay-lose-shift 
players emerges. Since extortioners never survive, the strength of 
exploitation χ is without effect. These results agree with those 
reported previously for sizable isolated well-mixed populations 41, 
and they highlight the severe challenges that extortioners face when 
vying for survival in the realm of evolutionary games where players 
are able to imitate strategies that are performing better 40. 

Figure 3 | Myopic best response updating in structured populations 
stabilizes extortion and cooperation. Depicted are the stationary 
frequencies of surviving strategies in dependence on the strength of the 
social dilemma b, as obtained for the strength of extortion χ = 1.5 on the 
square lattice (top), the random regular graph (middle), and the scale-free 
network (bottom). It can be observed that players adopting the WSLS 
strategy dominate for sufficiently small values of b on homogeneous 
interaction networks (top and middle), but as b increases or if the 
interaction network is heterogeneous (bottom), the pure WSLS phase gives 
way to a stable five-strategy WSLS + D + E_χ + TFT + C phase. Here 
defectors emerge and coarsen spontaneously because for sufficiently large 
values of b their payoff becomes larger than that of clustered WSLS players. 
The emergence of defectors immediately opens the door to the survival of 
extortioners and TFT players, which both emerge by chance and spread by 
means of neutral drift. Lastly, with the emergence of extortioners and TFT 
players cooperators become viable as well, thus forming the stable five-strategy 
phase. The latter is virtually unaffected by different values of χ, as 
demonstrated in Fig. 4. Importantly, the described coexistence of the 
competing strategies is a universal behavior that can be observed in 
structured populations regardless of the properties of the interaction 
network, and even across the whole span of b values, as illustrated in the 
bottom panel. Characteristic snapshots depicting the described key stages 
of the evolutionary process are presented in Fig. 5. 

If the evolution is governed by myopic best response updating, 
however, the outcomes are significantly different from those 
obtained via imitation. We have shown that for sufficiently large 
values of b the complete dominance of win-stay-lose-shift players 
is broken as soon as defectors emerge and start coarsening. 
Subsequently, within the homogeneous domains of defectors, extor- 
tion becomes viable too via the same mechanism as we have 
described before in 48 . In particular, extortioners and defectors are 
neutral, and hence the former can emerge by chance and spread 
via neutral drift. Yet as soon as extortioners emerge, cooperators 
can finally emerge as well, because in competition with the former 
they are superior. In this evolutionary scenario, defection and extor- 
tion thus act as the most surprising catalysts of unconditional coop- 
eration in structured populations. Moreover, we have shown that the 
coexistence of all competing strategies occurs across the whole inter- 
val of b values if a heterogeneous (scale-free) network describes the 
interactions among players. Because of this unlikely path towards 
cooperation, we conclude that defectors and extortioners effectively 
play the role of a Trojan horse for cooperators. Interestingly, similar 
transient roles of extortionate behavior were recently reported in the 
realm of well-mixed populations when studying the adaptive 
dynamics of extortion and compliance 50 . Moreover, after the emer- 
gence and coarsening of defectors, in the presently studied game the 
tit-for-tat players also become viable as they are likewise neutral, and 
can thus spread via neutral drift just like extortioners. In recurrence, 
these evolutionary processes give rise to a stable five-strategy phase 
that is hardly affected by the strength of exploitation and is also 
robust to the population size and the structure of the interaction 
network. 

Figure 4 | The strength of extortion has a negligible impact on the 
stationary frequencies of competing strategies, and it does not affect the 
evolutionary stability of extortion and cooperation. Depicted are the 
stationary frequencies of surviving strategies in dependence on the 
strength of extortion χ, as obtained for the social dilemma strength b = 2 
on a square lattice. It can be observed that the variations of all frequencies 
are small. Expectedly, larger values of χ favor extortion. The neutral drift of 
TFT players therefore becomes slightly less prolific, which in turn also 
slightly decreases the frequency of cooperators. Interestingly, the 
stationary frequencies of strategies at b = 2 and their χ-dependency are 
practically indistinguishable for the square lattice and the random regular 
graph. This further highlights the irrelevance of the structure of the 
interaction network under myopic best response updating, and thus also 
the universality of the presented results. 

Figure 5 | Characteristic time evolution of the spatial distribution of the five competing strategies on a square lattice. The evolution starts from the full 
WSLS phase (not shown), using b = 1.8 and χ = 1.5. At MCS = 5 (leftmost panel), first defectors start emerging because their payoff is comparable with that of 
WSLS players. Soon thereafter, at MCS = 10 (second panel from left), first extortioners and TFT players emerge. Both have neutral relations 
with the defectors, and thus their emergence and spreading are due to chance and neutral drift. At MCS = 30, as soon as locally the number of extortioners 
becomes sufficiently large, cooperators emerge as well due to their higher payoffs, and their spreading is additionally supported by the TFT players. The 
recurrence of these elementary processes eventually spreads a stable mixture of all five strategies across the whole population, as depicted in the rightmost 
panel that was taken at MCS = 100. The color encoding of the strategies is the same as used in Figs. 3 and 4. For clarity with regards to individual 
players and their strategies, we have used a small square lattice with linear size L = 40. 

Taken together, these results thus have a high degree of universal- 
ity and highlight the relevance of coarsening, the emergence of role- 
separating strategy distributions (which manifests as checkerboard 
ordering on regular graphs), and best response updating in evolutionary 
games. The latter is especially important, as it appears to be 
an integral part of human behavior 51-53. From a more pragmatic 
point of view, best response updating gives players the ability 
to explore the space of available strategies even if they are not present 
in their immediate neighborhood or even in the population as a 
whole, and by doing so, such updating dynamics opens up the door 
to the most counterintuitive evolutionary outcomes. Similarly to kin 
competition, the presented results also highlight the other side of 
network reciprocity. Namely, it does not only support cooperative 
behavior by means of clustering, but it also reveals the consequences 
of bad decisions - defectors and extortioners become weak when they 
become surrounded by their like. From this point of view, it is under- 
standable and indeed expected that structured populations, if any- 
thing, hinder the successful evolution of extortion under imitation. 
The surprising positive role of extortioners becomes apparent only 
under best response updating, where the looming threat of widespread 
defection is displaced by the lesser evil, which eventually introduces 
more constructive cooperative strategies. 



Methods 

We adopt the same game parametrization as Hilbe et al. 41. Accordingly, the payoff 
matrix for the five competing strategies is 

[payoff matrix] 

where b is the benefit to the other player provided by each cooperator at the cost c, and 
χ determines the surplus of the extortioner in relation to the surplus of the other 
player. Moreover, we use b − c = 1, thus having b > 1 and χ > 1 as the two main 
parameters. The former determines the strength of the social dilemma, while the latter 
determines just how strongly strategy E_χ exploits cooperators. A direct comparison of 
the extortioner strategy with the other strategies reveals that E_χ is neutral with 
unconditional defectors and players adopting the TFT strategy. The latter, however, 
may beat E_χ if they are surrounded by other TFT players. Similar relations hold for the 
competition between E_χ and WSLS players. While the latter receive the same income 
from a direct interaction, they do gain more if the neighbors also adopt the WSLS 
strategy. It is also worth noting that the payoffs between C and D constitute the 
so-called donation game, which is an important special case of the iterated prisoner's 
dilemma game with all the original properties retained 54. 

We predominantly consider an L × L square lattice with periodic boundary conditions 
as the simplest interaction network to describe a structured population. To 
demonstrate the robustness of our findings, we also use a random regular graph and 
the scale-free network with the same average degree, which is likely somewhat more 
apt to describe realistic social and technological networks 55. We have used population 
sizes from 10^4 up to 10^6 players to avoid finite-size effects. 

Unless stated differently, for example to illustrate a specific invasion process as in 
Fig. 5, we use random initial conditions such that all five strategies are uniformly 
distributed across the network. We carry out Monte Carlo simulations comprising the 
following elementary steps. First, a randomly selected player x with strategy s_x 
acquires its payoff p_x by playing the game with its k neighbors, as specified by the 
underlying interaction network. Next, player x changes its strategy s_x to s'_x with the 
probability 

$$q(s'_x \to s_x) = \frac{1}{1 + \exp[(p_x - p'_x)/K]} \qquad (1)$$

where p'_x is the payoff of the same player if adopting strategy s'_x within the same 
neighborhood, and K = 0.05 quantifies a small uncertainty that is related to the 
strategy adoption process 15. The strategy s'_x should of course be different from s_x, and 
it is drawn randomly from the remaining four strategies. Such strategy updating is 
known as the myopic best response rule 51. 

We also consider the more traditional strategy imitation, where player x imitates 
the strategy of a randomly selected neighbor y, only that p'_x in Eq. 1 is replaced by p_y 15, 
as well as death-birth updating as described for example in 56. Regardless of the applied 
strategy updating rule, we let the system evolve towards the stationary state where the 
average frequency of strategies becomes time independent. We measure time in full 
Monte Carlo steps (MCS), during which each player is given a chance to change its 
strategy once on average. 



1. Maynard Smith, J. & Szathmary, E. The Major Transitions in Evolution (W. H. 
Freeman & Co, Oxford, 1995). 

2. Maynard Smith, J. Evolution and the Theory of Games (Cambridge University 
Press, Cambridge, U.K., 1982). 

3. Weibull, J. W. Evolutionary Game Theory (MIT Press, Cambridge, MA, 1995). 

4. Hofbauer, J. & Sigmund, K. Evolutionary Games and Population Dynamics 
(Cambridge University Press, Cambridge, U.K., 1998). 

5. Mesterton-Gibbons, M. An Introduction to Game-Theoretic Modelling, 2nd 
Edition (American Mathematical Society, Providence, RI, 2001). 

6. Nowak, M. A. Evolutionary Dynamics (Harvard University Press, Cambridge, 
MA, 2006). 

7. Hamilton, W. D. Genetical evolution of social behavior I. J. Theor. Biol. 7, 1-16 
(1964). 

8. Nowak, M. A. & May, R. M. Evolutionary games and spatial chaos. Nature 359, 
826-829 (1992). 

9. Trivers, R. L. The evolution of reciprocal altruism. Q. Rev. Biol. 46, 35-57 (1971). 

10. Axelrod, R. & Hamilton, W. D. The evolution of cooperation. Science 211, 
1390-1396 (1981). 

11. Wilson, D. S. Structured demes and the evolution of group-advantageous traits. 
Am. Nat. 111, 157-185 (1977). 

12. Nowak, M. A. Five rules for the evolution of cooperation. Science 314, 1560-1563 
(2006). 

13. Doebeli, M. & Hauert, C. Models of cooperation based on prisoner's dilemma and 
snowdrift game. Ecol. Lett. 8, 748-766 (2005). 

14. Sigmund, K. Punish or perish? Retaliation and collaboration among humans. 
Trends Ecol. Evol. 22, 593-600 (2007). 

15. Szabo, G. & Fath, G. Evolutionary games on graphs. Phys. Rep. 446, 97-216 
(2007). 



SCIENTIFIC REPORTS | 4:5496 | DOI: 10.1038/srep05496 






16. Roca, C. P., Cuesta, J. A. & Sanchez, A. Evolutionary game theory: Temporal and 
spatial effects beyond replicator dynamics. Phys. Life Rev. 6, 208-249 (2009). 

17. Schuster, S., Kreft, J.-U., Schroeter, A. & Pfeiffer, T. Use of game-theoretical 
methods in biochemistry and biophysics. J. Biol. Phys. 34, 1-17 (2008). 

18. Perc, M. & Szolnoki, A. Coevolutionary games - a mini review. BioSystems 99, 
109-125 (2010). 

19. Perc, M., Gomez-Gardenes, J., Szolnoki, A., Floria, L. M. & Moreno, Y. 
Evolutionary dynamics of group interactions on structured populations: a review. 
J. R. Soc. Interface 10, 20120997 (2013). 

20. Rand, D. G. & Nowak, M. A. Human cooperation. Trends Cogn. Sci. 17, 413-425 
(2013). 

21. Fudenberg, D. & Maskin, E. The folk theorem in repeated games with discounting 
and incomplete information. Econometrica 54, 533-554 (1986). 

22. Nowak, M. A. & Sigmund, K. A strategy of win-stay, lose-shift that outperforms 
tit-for-tat in the prisoner's dilemma game. Nature 364, 56-58 (1993). 

23. Santos, F. C. & Pacheco, J. M. Scale-free networks provide a unifying framework 
for the emergence of cooperation. Phys. Rev. Lett. 95, 098104 (2005). 

24. Imhof, L. A., Fudenberg, D. & Nowak, M. A. Evolutionary cycles of cooperation 
and defection. Proc. Natl. Acad. Sci. USA 102, 10797-10800 (2005). 

25. Santos, F. C., Pacheco, J. M. & Lenaerts, T. Evolutionary dynamics of social 
dilemmas in structured heterogeneous populations. Proc. Natl. Acad. Sci. USA 
103, 3490-3494 (2006). 

26. Tanimoto, J. Dilemma solving by coevolution of networks and strategy in a 2 × 2 
game. Phys. Rev. E 76, 021126 (2007). 

27. Fu, F., Liu, L.-H. & Wang, L. Evolutionary prisoner's dilemma on heterogeneous 
Newman-Watts small-world network. Eur. Phys. J. B 56, 367-372 (2007). 

28. Gomez-Gardenes, J., Campillo, M., Floria, L. M. & Moreno, Y. Dynamical 
organization of cooperation in complex networks. Phys. Rev. Lett. 98, 108103 
(2007). 

29. Poncela, J., Gomez-Gardenes, J., Floria, L. M. & Moreno, Y. Robustness of 
cooperation in the evolutionary prisoner's dilemma on complex systems. New J. 
Phys. 9, 184 (2007). 

30. Fu, F., Hauert, C., Nowak, M. A. & Wang, L. Reputation-based partner choice 
promotes cooperation in social networks. Phys. Rev. E 78, 026117 (2008). 

31. Poncela, J., Gomez-Gardenes, J., Floria, L. M., Moreno, Y. & Sanchez, A. 
Cooperative scale-free networks despite the presence of defector hubs. EPL 88, 
38003 (2009). 

32. Antonioni, A. & Tomassini, M. Network fluctuations hinder cooperation in 
evolutionary games. PLoS ONE 6, e25555 (2011). 

33. Tanimoto, J., Brede, M. & Yamauchi, A. Network reciprocity by coexisting 
learning and teaching strategies. Phys. Rev. E 85, 032101 (2012). 

34. Gracia-Lazaro, C., Cuesta, J., Sanchez, A. & Moreno, Y. Human behavior in 
prisoner's dilemma experiments suppresses network reciprocity. Sci. Rep. 2, 325 
(2012). 

35. Gracia-Lazaro et al. Heterogeneous networks do not promote cooperation when 
humans play a prisoner's dilemma. Proc. Natl. Acad. Sci. USA 109, 12922-12926 
(2012). 

36. Axelrod, R. The Evolution of Cooperation (Basic Books, New York, 1984). 

37. Imhof, L. A., Fudenberg, D. & Nowak, M. A. Tit-for-tat or win-stay, lose-shift? 
J. Theor. Biol. 247, 574-580 (2007). 

38. Press, W. & Dyson, F. Iterated prisoner's dilemma contains strategies that 
dominate any evolutionary opponent. Proc. Natl. Acad. Sci. USA 109, 
10409-10413 (2012). 

39. Stewart, A. J. & Plotkin, J. B. Extortion and cooperation in the prisoner's dilemma. 
Proc. Natl. Acad. Sci. USA 109, 10134-10135 (2012). 

40. Adami, C. & Hintze, A. Evolutionary instability of zero-determinant strategies 
demonstrates that winning is not everything. Nat. Commun. 4, 2193 (2013). 

41. Hilbe, C., Nowak, M. & Sigmund, K. Evolution of extortion in iterated prisoner's 
dilemma games. Proc. Natl. Acad. Sci. USA 110, 6913-6918 (2013). 



42. Stewart, A. J. & Plotkin, J. B. From extortion to generosity, evolution in the iterated 
prisoner's dilemma. Proc. Natl. Acad. Sci. USA 110, 15348-15353 (2013). 

43. Szolnoki, A. & Perc, M. Reward and cooperation in the spatial public goods game. 
EPL 92, 38003 (2010). 

44. Helbing, D., Szolnoki, A., Perc, M. & Szabo, G. Evolutionary establishment of 
moral and double moral standards through spatial interactions. PLoS Comput. 
Biol. 6, e1000758 (2010). 

45. Szolnoki, A., Szabo, G. & Perc, M. Phase diagrams for the spatial public goods 
game with pool punishment. Phys. Rev. E 83, 036101 (2011). 

46. Fu, F., Tarnita, C., Christakis, N., Wang, L., Rand, D. & Nowak, M. Evolution of 
in-group favoritism. Sci. Rep. 2, 460 (2012). 

47. Fu, F., Nowak, M., Christakis, N. & Fowler, J. The evolution of homophily. Sci. 
Rep. 2, 845 (2012). 

48. Szolnoki, A. & Perc, M. Evolution of extortion in structured populations. Phys. 
Rev. E 89, 022804 (2014). 

49. Santos, F. C., Santos, M. D. & Pacheco, J. M. Social diversity promotes the 
emergence of cooperation in public goods games. Nature 454, 213-216 (2008). 

50. Hilbe, C., Nowak, M. & Traulsen, A. Adaptive dynamics of extortion and 
compliance. PLoS ONE 8, e77886 (2013). 

51. Matsui, A. Best response dynamics and socially stable strategies. J. Econ. Theor. 57, 
343-362 (1992). 

52. Blume, L. E. The Statistical Mechanics of Strategic Interactions. Games Econ. 
Behav. 5, 387-424 (1993). 

53. Ellison, G. Learning, Local Interaction, and Coordination. Econometrica 61, 
1047-1071 (1993). 

54. Brede, M. Short Versus Long Term Benefits and the Evolution of Cooperation in 
the Prisoner's Dilemma Game. PLoS ONE 8, e56016 (2013). 

55. Barabasi, A.-L. & Albert, R. Emergence of scaling in random networks. Science 
286, 509-512 (1999). 

56. Ohtsuki, H. & Nowak, M. A. The replicator equation on graphs. J. Theor. Biol. 243, 
86-97 (2006). 



Acknowledgments 

This research was supported by the Hungarian National Research Fund (Grant K-101490), 
TAMOP-4.2.2.A-11/1/KONV-2012-0051, and the Slovenian Research Agency (Grants 
J1-4055 and P5-0027). 

Author contributions 

A.S. and M.P. designed and performed the research as well as wrote the paper. 

Additional information 

Competing financial interests: The authors declare no competing financial interests. 

How to cite this article: Szolnoki, A. & Perc, M. Defection and extortion as unexpected 
catalysts of unconditional cooperation in structured populations. Sci. Rep. 4, 5496; 
DOI:10.1038/srep05496 (2014). 

This work is licensed under a Creative Commons Attribution 4.0 International 
License. The images or other third party material in this article are included in the 
article's Creative Commons license, unless indicated otherwise in the credit line; if 
the material is not included under the Creative Commons license, users will need 
to obtain permission from the license holder in order to reproduce the material. To 
view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ 






